TechMiner: Extracting Technologies from Academic Publications
Authors
Abstract
In recent years we have seen the emergence of a variety of scholarly datasets. Typically these capture ‘standard’ scholarly entities and their connections, such as authors, affiliations, venues, publications, citations, and others. However, as the repositories grow and the technology improves, researchers are adding new entities to these repositories to develop a richer model of the scholarly domain. In this paper, we introduce TechMiner, a new approach, which combines NLP, machine learning and semantic technologies, for mining technologies from research publications and generating an OWL ontology describing their relationships with other research entities. The resulting knowledge base can support a number of tasks, such as: richer semantic search, which can exploit the technology dimension to support better retrieval of publications; richer expert search; monitoring the emergence and impact of new technologies, both within and across scientific fields; studying the scholarly dynamics associated with the emergence of new technologies; and others. TechMiner was evaluated on a manually annotated gold standard and the results indicate that it significantly outperforms alternative NLP approaches and that its semantic features improve performance significantly with respect to both recall and precision.
Similar resources
Combining NLP And Semantics For Mining Software Technologies From Research Publications
The community of natural language processing (NLP) has developed a variety of methods for extracting and disambiguating information from research publications. However, they usually focus only on classic research entities such as authors, affiliations, venues, references and keywords. We propose a novel approach, which combines NLP and semantic technologies for generating from the free text of ...
Smart Topic Miner: Supporting Springer Nature Editors with Semantic Web Technologies
Academic publishers, such as Springer Nature, annotate scholarly products with the appropriate research topics and keywords to facilitate the marketing process and to support (digital) libraries and academic search engines. This critical process is usually handled manually by experienced editors, leading to high costs and slow throughput. In this demo paper, we present Smart Topic Miner (STM), ...
Logical Structure Analysis of Scientific Publications in
Even though the Linking Open Data cloud is constantly growing, there is a serious lack of published data sets related to the domain of academic mathematics. At the same time, since most scholarly publications in mathematics are well-structured and conventional, obtaining a detailed, useful representation of them is promising. The paper describes an approach to extracting and analyzing the structure ...
Patient Engagement as an Emerging Challenge for Healthcare Services: Mapping the Literature
Patients' engagement in healthcare is at the forefront of policy and research practice and is now widely recognized as a critical ingredient for a high-quality healthcare system. This study aims to analyze the current academic literature (from 2002 to 2012) about patient engagement by using bibliometric and qualitative content analyses. Extracting data from the electronic databases more likely to...
Towards Extracting Detailed Metadata from Academic Research Articles
In this paper we present the results of a pilot project that addressed the problem of extracting relationships between publications, datasets, and methodologies. Our ultimate goal is to automatically extract details about specific datasets used, including variables and time periods as well as methodologies and techniques. The extracted metadata will help both researchers and organizations perfo...